Recurrent neural networks (RNNs) have been broadly applied to natural language processing (NLP) problems. This kind of neural network is designed for modeling sequential data and has proven quite effective in sequential tagging tasks. In this paper, we propose to use a bi-directional RNN with long short-term memory (LSTM) units for Chinese word segmentation, a crucial preprocessing task for modeling Chinese sentences and articles. Classical methods focus on designing and combining hand-crafted features from context, whereas the bi-directional LSTM network (BLSTM) needs no prior knowledge or pre-designed features, and it excels at keeping contextual information in both directions. Experimental results show that our approach achieves state-of-the-art performance in word segmentation on both traditional Chinese and simplified Chinese datasets.
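To make the setup concrete, the following is a minimal sketch of bi-directional LSTM tagging for word segmentation. It assumes the common character-level B/M/E/S tag scheme (begin/middle/end of word, or single-character word), which the abstract does not spell out; the weights here are random and untrained, and all function and parameter names (`lstm_step`, `blstm_tag`, `E`, `V`, etc.) are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gate pre-activations stacked as [input, forget, output, candidate]."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[:H])          # input gate
    f = sigmoid(z[H:2 * H])     # forget gate
    o = sigmoid(z[2 * H:3 * H]) # output gate
    g = np.tanh(z[3 * H:])      # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def run_lstm(xs, W, U, b, H):
    """Run an LSTM over a list of input vectors, returning all hidden states."""
    h, c = np.zeros(H), np.zeros(H)
    states = []
    for x in xs:
        h, c = lstm_step(x, h, c, W, U, b)
        states.append(h)
    return states

def blstm_tag(char_ids, E, params_f, params_b, V, TAGS=("B", "M", "E", "S")):
    """Tag each character with B/M/E/S from concatenated forward+backward LSTM states."""
    xs = [E[i] for i in char_ids]               # character embeddings
    H = params_f[2].shape[0] // 4               # hidden size from bias vector
    fwd = run_lstm(xs, *params_f, H)            # left-to-right pass
    bwd = run_lstm(xs[::-1], *params_b, H)[::-1]  # right-to-left pass, re-aligned
    tags = []
    for hf, hb in zip(fwd, bwd):
        scores = V @ np.concatenate([hf, hb])   # project to 4 tag scores
        tags.append(TAGS[int(np.argmax(scores))])
    return tags

# Random (untrained) parameters for a toy vocabulary of 100 characters.
D, H, vocab = 8, 16, 100
E = rng.normal(size=(vocab, D))
def make_params():
    return (rng.normal(size=(4 * H, D)), rng.normal(size=(4 * H, H)), np.zeros(4 * H))
V = rng.normal(size=(4, 2 * H))

tags = blstm_tag([5, 17, 42, 3], E, make_params(), make_params(), V)
print(tags)  # one B/M/E/S tag per input character
```

In a trained model, decoding the B/M/E/S tag sequence back into word boundaries recovers the segmentation; because each tag decision sees both the forward and backward hidden states, context on both sides of a character is available without hand-crafted features.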